Skip to content

Conversation

@Enjection
Copy link
Owner

This document describes the architecture and implementation approach for
extending incremental restore functionality to support global indexes,
complementing the backup support already implemented in branch
feature/incr-backup/indexes-support-003.### Changelog entry

...

Changelog category

  • New feature
  • Experimental feature
  • Improvement
  • Performance improvement
  • User Interface
  • Bugfix
  • Backward incompatible change
  • Documentation (changelog entry is not required)
  • Not for changelog (changelog entry is not required)

Description for reviewers

...

@coderabbitai
Copy link

coderabbitai bot commented Nov 7, 2025

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

📝 Walkthrough

Summary by CodeRabbit

Release Notes

  • New Features

    • Global indexes now support incremental restore alongside table data for seamless recovery.
    • Parallel restoration of tables and indexes for improved performance.
    • Enhanced error handling for index restore scenarios, including graceful handling of missing indexes and schema mismatches.
  • Tests

    • Comprehensive test coverage for incremental backup and restore scenarios with various index configurations and table setups.

Walkthrough

Adds index incremental-restore support: discovers backed-up global indexes, maps backup-relative index paths to restore targets, creates per-index restore operations in parallel with table restores, registers index ops in unified in-progress tracking, and finalizes activation atomically once all ops complete.

Changes

Cohort / File(s) Summary
Design & Plan
INCREMENTAL_RESTORE_INDEXES_PLAN.md
New implementation plan describing discovery, mapping helpers, per-index restore operation creation, unified progress tracking, phased rollout, error handling, and test strategy.
Header Declarations
ydb/core/tx/schemeshard/schemeshard_impl.h
Adds private TSchemeShard declarations: DiscoverAndCreateIndexRestoreOperations, DiscoverIndexesRecursive, CreateSingleIndexRestoreOperation, and FindTargetTablePath for orchestrating index restores.
Core Logic
ydb/core/tx/schemeshard/schemeshard_incremental_restore_scan.cpp
Extends CreateIncrementalRestoreOperation to invoke index discovery; implements FindTargetTablePath, DiscoverIndexesRecursive, DiscoverAndCreateIndexRestoreOperations, and CreateSingleIndexRestoreOperation; constructs and sends per-index ESchemeOpRestoreMultipleIncrementalBackups modify-scheme requests; integrates index ops into IncrementalRestoreState and TxIdToIncrementalRestore tracking; supports parallel submission of table and index restores.
Unit Tests
ydb/core/tx/datashard/datashard_ut_incremental_backup.cpp
Adds many new end-to-end unit tests covering incremental backups/restores with indexes: single/multiple indexes, multi-table scenarios, multiple incremental backups, indexImplTable verification, and data/index integrity checks.

Sequence Diagram(s)

sequenceDiagram
    actor User
    participant TSchemeShard
    participant CreateInc as CreateIncrementalRestoreOperation
    participant DiscoverIdx as DiscoverAndCreateIndexRestoreOperations
    participant CreateIdx as CreateSingleIndexRestoreOperation
    participant Tracking as InProgressOperations

    User->>TSchemeShard: Request Incremental Restore
    TSchemeShard->>CreateInc: Prepare table restore operations
    CreateInc->>Tracking: Register table ops

    Note over CreateInc: Parallel index discovery & creation (new)
    CreateInc->>DiscoverIdx: Discover backed-up indexes
    DiscoverIdx->>DiscoverIdx: Scan indexes metadata recursively
    loop For each discovered index
        DiscoverIdx->>CreateIdx: Create per-index restore op
        CreateIdx->>CreateIdx: Validate target, locate indexImplTable, form restore request
        CreateIdx->>Tracking: Register index op
    end

    rect rgba(200,220,255,0.25)
        Note over Tracking: Unified tracking for table & index operations
    end

    Tracking->>TSchemeShard: All ops complete → Finalize (atomic activation)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

  • Multiple new functions with intertwined discovery, path-mapping, and operation-creation logic.
  • Cross-cutting edits: headers, core orchestration, and extensive tests.
  • Pay special attention to:
    • Path resolution in FindTargetTablePath.
    • Correct detection of global indexes and determination of indexImplTable.
    • Consistency of IncrementalRestoreState and TxIdToIncrementalRestore mappings.
    • Handling of OmitIndexes, missing index metadata, and schema mismatches.
    • Stability of added tests in multi-shard scenarios.

Poem

🐰 I dug through backups, sniffed each index trail,

Mapped tiny paths and set each hop to sail,
Hopped side-by-side with tables, swift and keen,
Tracked every move until the scene was green,
A rabbit’s cheer — restores complete and hale.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 13.33% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
Title check ❓ Inconclusive The title 'WIP' is too vague and does not clearly convey the main purpose of the changeset, which is to implement incremental restore support for global indexes. Update the title to be descriptive of the actual change, such as 'Implement incremental restore support for global indexes' or similar.
✅ Passed checks (1 passed)
Check name Status Explanation
Description check ✅ Passed The description mentions the architecture and implementation approach for extending incremental restore functionality to support global indexes, which aligns with the actual changeset content.

Warning

Review ran into problems

🔥 Problems

Git: Failed to clone repository. Please run the @coderabbitai full review command to re-trigger a full review. If the issue persists, set path_filters to include or exclude specific files.


Comment @coderabbitai help to get the list of available commands and usage tips.

@Enjection
Copy link
Owner Author

@coderabbitai review

@coderabbitai
Copy link

coderabbitai bot commented Nov 7, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 49e024a and 9bf4a6d.

📒 Files selected for processing (4)
  • INCREMENTAL_RESTORE_INDEXES_PLAN.md (1 hunks)
  • ydb/core/tx/datashard/datashard_ut_incremental_backup.cpp (1 hunks)
  • ydb/core/tx/schemeshard/schemeshard_impl.h (1 hunks)
  • ydb/core/tx/schemeshard/schemeshard_incremental_restore_scan.cpp (1 hunks)

@Enjection
Copy link
Owner Author

@coderabbitai review

@coderabbitai
Copy link

coderabbitai bot commented Nov 7, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@Enjection
Copy link
Owner Author

@coderabbitai full review

@coderabbitai
Copy link

coderabbitai bot commented Nov 7, 2025

✅ Actions performed

Full review triggered.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
ydb/core/tx/datashard/datashard_ut_incremental_backup.cpp (1)

2522-3124: Consider factoring common restore scaffolding

The five new Y_UNIT_TEST blocks all replay the same backup/restore choreography (create tables + indexes → run full backup → run one or more incremental backups → drop tables → RESTORE → repeat the same verification queries), with only the table shape or number of indexes varying. Could we pull the shared setup/restore/assert steps into helpers (e.g. a RunIncrementalIndexScenario that takes table/index definitions and expected rows)? That would collapse a few hundred duplicated lines and make future scenarios cheaper to add.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 49e024a and d520a69.

📒 Files selected for processing (4)
  • INCREMENTAL_RESTORE_INDEXES_PLAN.md (1 hunks)
  • ydb/core/tx/datashard/datashard_ut_incremental_backup.cpp (1 hunks)
  • ydb/core/tx/schemeshard/schemeshard_impl.h (1 hunks)
  • ydb/core/tx/schemeshard/schemeshard_incremental_restore_scan.cpp (1 hunks)
🔇 Additional comments (3)
ydb/core/tx/schemeshard/schemeshard_impl.h (1)

1197-1225: Header stays in lockstep

Thanks for keeping the private declarations in the header in sync with the new helpers — that keeps navigation between declaration/definition straightforward.

INCREMENTAL_RESTORE_INDEXES_PLAN.md (1)

43-208: Implementation plan hits the right checkpoints

Appreciate that the plan calls out the phased rollout (“discover → helper → progress → finalize → tests”). That matches the code changes and makes the review a lot easier to follow.

ydb/core/tx/schemeshard/schemeshard_incremental_restore_scan.cpp (1)

611-860: Great to see indexes wired into incremental restore flow

Really happy to see DiscoverAndCreateIndexRestoreOperations sharing the same state accounting as the table path (especially adding the index shard set into ExpectedShards). That lets the existing progress tracking/finalization machinery work without special-casing index jobs.

This document describes the architecture and implementation approach for
extending incremental restore functionality to support global indexes,
complementing the backup support already implemented in branch
feature/incr-backup/indexes-support-003.
This commit adds support for restoring global indexes during incremental
restore operations, extending the existing table-only restore functionality.

Key changes:

1. Index Discovery (DiscoverAndCreateIndexRestoreOperations):
   - Scans __ydb_backup_meta/indexes directory in backup
   - Discovers index backups for each table
   - Validates OmitIndexes flag from backup config

2. Index Restore Operations (CreateSingleIndexRestoreOperation):
   - Creates parallel restore operations for index implementation tables
   - Validates indexes exist on target tables
   - Filters for EIndexTypeGlobal indexes only (matches backup behavior)
   - Tracks operations same as table operations for unified completion

3. Path Mapping (FindTargetTablePath):
   - Maps relative backup paths to target table paths
   - Uses backup collection's ExplicitEntryList for mapping

Architecture:
- Tables and indexes restore IN PARALLEL (no artificial dependencies)
- Both tracked in same InProgressOperations set
- Atomic activation when ALL operations complete (via existing finalization)
- Gracefully handles missing indexes and OmitIndexes flag

The implementation reuses existing restore operation types and tracking
mechanisms, requiring no changes to proto definitions or finalization logic.

Related to: feature/incr-backup/indexes-support-003
This commit adds 5 comprehensive test cases covering all aspects of
incremental restore for global indexes:

1. BasicIndexIncrementalRestore:
   - Tests single global index restore
   - Verifies index functionality after restore
   - Tests querying via index after restoration

2. MultipleIndexesIncrementalRestore:
   - Tests table with 3 global indexes
   - Verifies all indexes restore correctly
   - Validates parallel index restore functionality

3. IndexDataVerificationIncrementalRestore:
   - Multi-shard table with index
   - Tests INSERT, UPDATE, DELETE operations across shards
   - Verifies index data integrity after restore
   - Compares before/after restore index query results

4. MultipleIncrementalBackupsWithIndexes:
   - Tests sequence of 3 incremental backups
   - Each backup has different operations (add, update, delete)
   - Verifies indexes restored correctly across all increments

5. MultipleTablesWithIndexesIncrementalRestore:
   - Tests backup collection with 2 tables
   - Each table has different global index
   - Verifies parallel restore of multiple table+index combinations

Test Coverage:
- Basic index restore ✓
- Multiple indexes per table ✓
- Multi-shard index tables ✓
- Index data verification ✓
- Multiple incremental sequences ✓
- Multiple tables with indexes ✓
- Parallel completion ✓ (implicitly tested in all tests)
- Index queries after restore ✓

Total: 506 lines of test code added
Enhanced all 5 index restore tests to directly query index implementation
tables, providing stronger verification that indexes are correctly restored.

Changes per test:

1. BasicIndexIncrementalRestore:
   - Query indexImplTable directly to verify structure
   - Count rows in index impl table (should be 4)
   - Verify specific indexed_col values (100, 250, 300, 400)

2. MultipleIndexesIncrementalRestore:
   - Query all 3 index impl tables (index1, index2, index3)
   - Verify row counts for each (all should have 4 rows)
   - Spot-check index3 impl table data

3. IndexDataVerificationIncrementalRestore:
   - Query age_index impl table directly
   - Verify correct data after INSERT/UPDATE/DELETE operations
   - Confirm deleted records NOT in index impl table
   - Verify row count (should be 4 after deletions)

4. MultipleIncrementalBackupsWithIndexes:
   - Query idx impl table after 3 incremental backups
   - Verify old values removed (100, 200)
   - Verify new/updated values present (250, 300, 400)
   - Confirm final row count (should be 3)

5. MultipleTablesWithIndexesIncrementalRestore:
   - Query both idx1 and idx2 impl tables
   - Verify row counts for both (should be 3 each)
   - Spot-check data in both index impl tables

Benefits:
- Stronger verification that index tables were actually restored
- Tests verify index table structure is intact
- Validates index data matches expected state
- Ensures index impl tables are queryable after restore
- Confirms parallel restore of multiple indexes

Pattern used: `/Root/{table}/{index}/indexImplTable`
Key insight: Index columns appear first, then main table key columns

Total: +98 lines of enhanced test verification
Addresses two critical PR review comments:

1. Fix nested table path discovery (CRITICAL BUG):
   - Previous: Only looked 1 level deep in __ydb_backup_meta/indexes/
   - Impact: Tables in subdirectories (e.g. /Root/Dir/Foo) had indexes silently skipped
   - Fix: Implemented recursive DiscoverIndexesRecursive() that:
     * Traverses the full directory tree under indexes/
     * Accumulates relative path as it descends (Root -> Root/Dir -> Root/Dir/Foo)
     * Matches against target tables at each level
     * Creates index operations when match found

   Example:
     Backup structure: __ydb_backup_meta/indexes/Root/Dir/Foo/index1
     Old behavior: Only saw "Root", failed to match, skipped all indexes
     New behavior: Descends to "Root/Dir/Foo", matches table, restores indexes

2. Fix path matching to prevent false positives:
   - Previous: itemPath.EndsWith(relativeTablePath)
   - Problem: /Root/FooBar matched "Bar", /Root/users_table matched "table"
   - Fix: Only accept exact match OR suffix preceded by "/"
     * itemPath == relativeTablePath
     * itemPath.EndsWith("/" + relativeTablePath)

Changes:
- Added DiscoverIndexesRecursive() helper for recursive directory traversal
- Updated FindTargetTablePath() to use exact matching logic
- Updated DiscoverAndCreateIndexRestoreOperations() to call recursive helper

Files modified:
- schemeshard_impl.h: Added DiscoverIndexesRecursive() declaration
- schemeshard_incremental_restore_scan.cpp: Implemented recursive discovery

Both issues validated and confirmed as correct feedback.
@Enjection Enjection force-pushed the claude/implement-index-restore-011CUsry2xyTV6Y17uNuNUZM branch from d520a69 to 0197f82 Compare November 7, 2025 09:56
Problem:
- After restoring index impl tables, schema version mismatch occurred
- Index metadata expected version X but impl table had version X+1
- Tests failed with: "schema version mismatch during metadata loading"

Root Cause:
- Index impl table restore incremented the impl table's schema version
- Parent index metadata AlterVersion was not updated
- This caused a mismatch when querying the restored table

Solution:
- Added AlterTableIndex operation after each index impl table restore
- Mirrors CDC stream pattern for avoiding additional alterVersion
- AlterTableIndex operation syncs index metadata with impl table version
- Sets index state to Ready, which increments the index AlterVersion

The fix creates two operations per index restore:
1. Restore impl table (increments impl table schema version)
2. AlterTableIndex at parent table (syncs index metadata AlterVersion)

Reference pattern from schemeshard__operation_create_cdc_stream.cpp:
When working on index impl tables, create AlterTableIndex operation
to avoid schema version mismatch between index and impl table.
@Enjection Enjection force-pushed the claude/implement-index-restore-011CUsry2xyTV6Y17uNuNUZM branch from d298e46 to 252b8a1 Compare November 7, 2025 11:06
claude and others added 4 commits November 7, 2025 13:08
Problem:
- Index impl table restore incremented impl table's AlterVersion
- Parent index metadata AlterVersion was not synchronized
- Caused "schema version mismatch" errors during table queries

Root Cause:
- Index restore operations were created independently via Send()
- No coordination between impl table restore and index metadata update
- AlterTableIndex operations executed in wrong order or not at all

Solution:
- Added AlterTableIndex to result vector in CreateRestoreMultipleIncrementalBackups
- Check if destination table parent IsTableIndex() (index impl table pattern)
- If yes, add coordinated AlterTableIndex operation to result
- Both operations execute as part of same parent operation

This mirrors the CDC backup pattern where:
1. Operations are added to result vector for coordination
2. Index impl table operations include corresponding index metadata updates
3. Pattern: check workingDirPath.IsTableIndex(), add AlterTableIndex

For restore, we check dstTablePath.Parent().IsTableIndex() since we're
targeting the impl table directly, not the index path.

Files changed:
- schemeshard__operation_create_restore_incremental_backup.cpp: Added AlterTableIndex to result vector
- schemeshard_incremental_restore_scan.cpp: Removed incorrect Send-based approach
Problem:
- Index impl table restore incremented impl table's AlterVersion
- Parent index metadata AlterVersion was not synchronized
- Caused 'schema version mismatch' errors during table queries

Root Cause:
- Used Parent().IsTableIndex() which returned false for index impl tables
- For index impl table path /Root/Table/Index/indexImplTable:
  - Parent().IsTableIndex() checks if parent PathType is TableIndex
  - This was incorrectly returning false

Solution:
- Changed check to dstTablePath.IsInsideTableIndexPath()
- IsInsideTableIndexPath() correctly detects tables inside index structures
- It skips the impl table level and checks if ancestor is a TableIndex
- Now correctly identifies index impl tables and creates AlterTableIndex

This ensures index metadata AlterVersion stays in sync with impl table.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants